Formant Prediction from MFCC Vectors
Authors
Abstract
This work proposes a novel method of predicting formant frequencies from a stream of mel-frequency cepstral coefficient (MFCC) feature vectors. Prediction is based on modelling the joint density of MFCC vectors and formant vectors using a Gaussian mixture model (GMM). Using this GMM and an input MFCC vector, two maximum a posteriori (MAP) prediction methods are developed. The first method predicts formants from the cluster that is closest, in some sense, to the input MFCC vector, while the second method takes a weighted contribution of formants from all clusters. Experimental results are presented using the ETSI Aurora connected digit database and show that the predicted formant frequency is within 3.25% of the reference formant frequency, as measured from hand-corrected formant tracks.
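The two prediction methods described in the abstract can be illustrated with a minimal sketch. The abstract does not give the equations, so this sketch assumes the standard conditional-Gaussian form: each cluster predicts the formant vector as its conditional mean given the MFCC vector, and "closest" is taken to mean the cluster with the highest posterior probability. The function names (`log_gauss`, `predict_formants`), the parameter layout, and the stacking order of the joint vector (MFCCs first, formants second) are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def log_gauss(x, mean, cov):
    """Log density of a multivariate Gaussian evaluated at x."""
    d = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = d @ np.linalg.solve(cov, d)
    return -0.5 * (len(x) * np.log(2.0 * np.pi) + logdet + maha)


def predict_formants(x, weights, means, covs, dx):
    """Predict a formant vector from one MFCC vector x.

    Assumed layout (hypothetical): each joint vector is [mfcc; formants],
    so the first dx entries of every mean and covariance belong to the MFCCs.
    Returns (closest_cluster_prediction, weighted_prediction).
    """
    n_clusters = len(weights)
    log_post = np.empty(n_clusters)
    cond_means = []
    for k in range(n_clusters):
        mu_x, mu_f = means[k][:dx], means[k][dx:]
        s_xx = covs[k][:dx, :dx]   # MFCC covariance block
        s_fx = covs[k][dx:, :dx]   # formant-MFCC cross-covariance block
        # Unnormalised log posterior of cluster k given x.
        log_post[k] = np.log(weights[k]) + log_gauss(x, mu_x, s_xx)
        # Conditional mean of the formants given x within cluster k.
        cond_means.append(mu_f + s_fx @ np.linalg.solve(s_xx, x - mu_x))
    cond_means = np.array(cond_means)

    # Normalise the posteriors in a numerically stable way.
    post = np.exp(log_post - log_post.max())
    post /= post.sum()

    closest = cond_means[np.argmax(post)]  # method 1: single best cluster
    weighted = post @ cond_means           # method 2: all clusters, weighted
    return closest, weighted
```

In this sketch the second method reduces to the first whenever one cluster dominates the posterior; the two differ mainly for MFCC vectors that fall between clusters, where the weighted form gives a smoother prediction.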
Similar resources
Predicting Formant Frequencies from MFCC Vectors
This work proposes a novel method of predicting formant frequencies from a stream of mel-frequency cepstral coefficients (MFCC) feature vectors. Prediction is based on modelling the joint density of MFCCs and formant frequencies using a Gaussian mixture model (GMM). Using this GMM and an input MFCC vector, two maximum a posteriori (MAP) prediction methods are developed. The first method predict...
Formant frequency prediction from MFCC vectors in noisy environments
This paper proposes a method of predicting the formant frequencies of a frame of speech from its mel-frequency cepstral coefficient (MFCC) representation. Prediction is achieved through the creation of a Gaussian mixture model (GMM) which models the joint density of formant frequencies and MFCCs. Using this GMM and an input MFCC vector, a maximum a posteriori (MAP) prediction of the formant fre...
A comparison of estimated and MAP-predicted formants and fundamental frequencies with a speech reconstruction application
This work compares the accuracy of fundamental frequency and formant frequency estimation methods and maximum a posteriori (MAP) prediction from MFCC vectors with hand-corrected references. Five fundamental frequency estimation methods are compared to fundamental frequency prediction from MFCC vectors in both clean and noisy speech. Similarly, three formant frequency estimation and prediction m...
HMM-based MAP Prediction of Formant Frequencies from N
This paper describes how formant frequencies of voiced and unvoiced speech can be predicted from mel-frequency cepstral coefficients (MFCC) vectors using maximum a posteriori (MAP) estimation within a hidden Markov model (HMM) framework. Gaussian mixture models (GMMs) are used to model the local joint density of MFCCs and formant frequencies. More localised prediction is achieved by modelling s...
Reconstructing clean speech from noisy MFCC vectors
The aim of this work is to reconstruct clean speech solely from a stream of noise-contaminated MFCC vectors, as may be encountered in distributed speech recognition systems. Speech reconstruction is performed using the ETSI Aurora back-end speech reconstruction standard which requires MFCC vectors, fundamental frequency and voicing information. In this work, fundamental frequency and voicing ar...